1. INTRODUCTION 

Object depth estimation from a 2D image is one of the most active research topics because it is a 
basic problem in computer vision and has important applications in robotics, pattern recognition, graphics and 
machine vision. 3D image construction is a challenging problem, and many studies have been performed 
to resolve it. In [1] the authors proposed a supervised learning approach for depth estimation, using a 3D 
scanner to collect the training data that was used to model the conditional distribution of depth given the 
monocular image features. Several methods for object detection are presented in [2]. 

Other investigators [3] have developed a simultaneous phase-shifting technique using an innovative 
colour fringe pattern with multiple triangular modulation for a 3D vision system. Shape-from-shading and 
depth-from-shading techniques for 3D surface reconstruction were presented in [4, 5]. However, the shading 
technique is only valid for acquiring object height information in the direction associated with the incident 
light and the generated object shadow. 

Many works on 3D construction have focused on 2D images. In [6] a grid of pseudorandom encoded 
structured light was projected onto the object, and a set of reference pixels was determined in two 
simultaneous views of the same object using two cameras. In [7] the model of the 3D object obtained 
from a 2D image is based on human-computer interaction: the human is provided with as much visual 
assistance as possible to make a correct input and verify it. 

Constructing a 3D graphical image model was proposed in [8], where six surface images (front, 
back, left, right, top and bottom views) were used to form the model and the colour matrices. 
Reconstruction methods from contour lines were provided in [9]. In that model, Zhong and coauthors suggested 
removing some of the redundant points on every contour and interpolating them using cubic Bezier curves. 
In [10] the authors suggested representing a geometric 3D shape as a probability distribution of binary variables 
on a 3D voxel grid using a Convolutional Deep Belief Network. Generating a 3D image from a sequence of 
consecutive 2D images was investigated in [11]. 3D face reconstruction was introduced in [12], where the 
authors integrated various face poses in order to reconstruct a 3D face model. 

The model suggested in this paper differs from the others by proposing a new methodology to 
estimate the depth of an object. The model takes advantage of the light distribution over an object, which 
changes with the object depth as well as with the angle of the incident light. In our model, a set of points is 
assigned in the 2D image, lying inside the object and along its contour. We refer to these points as 
control points. A neural network technique is used to estimate the depth, where the object width, the 
illumination and the angle (θ) of the incident light are considered. The 2D object image is reconstructed into 
a 3D object using a Bezier spline surface. The proposed approach was tested on several 2D images and 
showed a mean depth error within 3%. 


2. CAMERA-OBJECT SETUP 

In order to obtain complex 3D objects with different depths, AutoCAD software has been utilized to 
form various shapes with varying depth. The 2D images are captured by linking a camera with a target point 
light, both located perpendicular to the object at a distance of (150) cm. Figure 1 shows a sketch of the 
camera-object setup, where L and Rd are the camera height and the depth of the object respectively. 


Figure 1. The implemented structure (figure label: camera and target point light) 


The resolution of the captured image is 640×480. A set of image pre-processing techniques is 
applied, including digital image denoising, thresholding and edge detection as presented in [13], and 
contour tracing to extract the edge as a set of connected pixels as presented in [14]. K-means (KM) 
classification can also be used to classify the data into K defined groups [15]. In this paper a cylinder object 
with a radius of (15) cm and a height of (40) cm is used as the experimental sample to illustrate how the 3D 
object image is constructed by applying the proposed stages of the methodology; Figures (2-a) and (2-b) show 
the image of the cylinder and its contour respectively. 


3. RESEARCH METHODS 

The proposed method is divided into three main parts: setting the control points, the neural network 
technique, and construction of the 3D object via Bezier surface interpolation. The parts are described in the 
following subsections; Figure 2 shows the cylinder sample used to illustrate them. 





Figure 2. Cylinder object. (a) The 2D image; (b) the contour of the image 


3.1. Set the control point 

Setting the control points is an essential step in determining the object depth. It consists of two 
steps. In the first step, the object contour is divided into two halves, right and left. Figure 3 shows the halves 
of the cylinder contour, where each half includes two types of points: the edge control points and the half 
control points. The edge control points are located along the object contour, while the half control points are 
placed in the middle of each half. The spacing between the edge control points is equal to the spacing between 
the half control points. Experimentally, (10) pixels is a convenient distance between control points, as it gives 
an accurate trace of the object shape. 


Figure 3. The two contour halves of the cylinder 


The right and the left sides of the object contour can be represented by two matrices Rs and Ls 
respectively, where each element contains the (x, y) coordinates of a control point. The right half side: 

$$R_s = [r_{ij}], \quad i = 1, \dots, n, \; j = 1, \dots, m. \qquad (1)$$

The left half side: 

$$L_s = [l_{ij}], \quad i = 1, \dots, n, \; j = 1, \dots, m. \qquad (2)$$

where n is the number of control points, while m is the number of types of these points and equals 2 (edge and 
half). In the next step of the setting method, Rs and Ls are combined into one matrix Ts, as shown in 
Equation (3). 

$$T_s = [R_s \; L_s]_{n \times K} \qquad (3)$$

where K is the total number of control point types and equals 4. 
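
The construction of Rs, Ls and Ts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (one contour pixel per image row for each side) and the function name `place_control_points` are hypothetical, and the half control point is taken as the midpoint between the edge and the centre line of the slice.

```python
def place_control_points(left_edge, right_edge, spacing=10):
    """Build Rs, Ls and Ts from the two contour halves.

    left_edge / right_edge: lists of (x, y) contour pixels, one per image
    row, ordered top to bottom (hypothetical input format).
    """
    rs, ls, ts = [], [], []
    rows = min(len(left_edge), len(right_edge))
    for row in range(0, rows, spacing):
        (xl, yl), (xr, yr) = left_edge[row], right_edge[row]
        xc = (xl + xr) / 2.0                 # centre line splitting the contour
        left_half = ((xl + xc) / 2.0, yl)    # half control point, left side
        right_half = ((xc + xr) / 2.0, yr)   # half control point, right side
        rs.append([(xr, yr), right_half])    # m = 2 types: edge, half
        ls.append([(xl, yl), left_half])
        ts.append(rs[-1] + ls[-1])           # Ts = [Rs Ls], K = 4 per slice
    return rs, ls, ts
```

Each row of `ts` then holds the four control points of one slice in the column order of Equation (3).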

In our method, a slice refers to a set of points that lie at the same level, where each slice 
has 4 control points. These points are connected to one another to construct the slice shape. Figure 4 
shows the slice shape of the cylinder, which has (33) slices with a length of (288) pixels. The cylinder ends are 
represented by (12) slices, while the cylinder body is represented by (21) slices. The slices that 
constitute the cylinder body have equal width, while the slices at the ends appear very close together compared 
with those of the body, even though the distance between all cylinder slices is equal to 10 pixels, as mentioned 
previously. This is due to the convergence of the left and right contour sides. 


Figure 4. (a) Cylinder shape; (b) Shape of the object as slices and set of control points 


3.2. Neural network technique 

The neural network is a powerful data processing technique that has reached maturity and 
broad application [16, 17]. In this paper the neural network technique is used to estimate the depth value. 
A network with five inputs, two hidden layers and one output is proposed. The hidden layers are composed 
of eight neurons. The height of the slice, the illumination of the slice midpoint, the illumination of the two 
half control points and the incident light angle (θ) represent the five inputs of the network, while the depth 
represents the output. The first input is the number of pixels in the slice, while the intensity values of the 
midpoint and the two half control points are used as the second, third and fourth inputs respectively. The fifth 
input (θ), shown in Figure 5, is defined as the angle between the incident light perpendicular to the centered 
midpoint and the incident light on the required midpoint, as in Equation (4). 


$$\theta_k = \tan^{-1}\left(\frac{Y_k - Y_{n/2}}{L}\right), \quad k = 1, \dots, n. \qquad (4)$$


where $(Y_k - Y_{n/2})$ is the vertical distance between the slice midpoint and the centered midpoint, and L is 
the perpendicular distance of the incident light on the centered midpoint. 
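
As a small worked illustration of Equation (4), the angle of every slice can be computed as below. The function name `incident_angles` is hypothetical, the centered midpoint is assumed to be the middle element of the list, and the vertical coordinates are assumed to be expressed in the same unit as L.

```python
import math

def incident_angles(midpoint_y, distance_l):
    """Return theta_k in radians for each slice midpoint, per Equation (4)."""
    # Centered midpoint Y_{n/2}: assumed to be the middle slice of the list.
    y_centre = midpoint_y[len(midpoint_y) // 2]
    return [math.atan((y - y_centre) / distance_l) for y in midpoint_y]
```

For three midpoints at heights 0, 10 and 20 with L = 10, the angles are -45, 0 and 45 degrees.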

The proposed network is verified using MATLAB, where the back-propagation learning 
algorithm is used with a learning rate of 0.05. All neurons use a sigmoid activation function. The 
network is trained using 300 sets that were selected to cover many depth cases for different curved shapes. 
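
The following is a minimal pure-Python sketch of such a network, not the MATLAB implementation used in the paper. It assumes eight neurons in each of the two hidden layers, online (per-sample) weight updates, a squared-error loss, and inputs and depth targets scaled to the range (0, 1) so that the sigmoid output can represent them; all function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_network(sizes=(5, 8, 8, 1), seed=0):
    """One weight row per neuron; the last entry of each row is the bias."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(net, x):
    """Return the activations of every layer, input included."""
    acts = [list(x)]
    for layer in net:
        prev = acts[-1] + [1.0]                       # append the bias input
        acts.append([sigmoid(sum(w * p for w, p in zip(row, prev)))
                     for row in layer])
    return acts

def train_step(net, x, target, lr=0.05):
    """One online back-propagation update; returns the squared error."""
    acts = forward(net, x)
    out = acts[-1]
    # Output delta for a squared-error loss with a sigmoid unit.
    deltas = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
    for li in range(len(net) - 1, -1, -1):
        layer, prev = net[li], acts[li] + [1.0]
        # Propagate the deltas backwards before the weights are changed.
        back = [sum(layer[j][i] * deltas[j] for j in range(len(layer)))
                for i in range(len(acts[li]))]
        for j, row in enumerate(layer):
            for i in range(len(row)):
                row[i] -= lr * deltas[j] * prev[i]
        deltas = [b * a * (1.0 - a) for b, a in zip(back, acts[li])]
    return sum((o - t) ** 2 for o, t in zip(out, target))
```

Training would repeat `train_step` over the 300 training sets for several epochs; the estimated depth of a slice is then `forward(net, inputs)[-1][0]`, rescaled back to centimetres.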
Figure 5. The angle of midpoint incident light 


3.3. Construct the 3D shape 

Bezier surfaces are a type of mathematical spline used in computer graphics, computer-aided 
design and finite element modeling. A Bezier surface is defined by a rectangular grid of control points; it is 
anchored at the four corner points and employs the other grid points to determine its shape, as pointed out in [18-20]. 

In this work Bezier surface interpolation is utilized to construct the 3D shape, where the 
estimated depth of each slice is used as the z value for the half control points of that slice, while 
the depth of the edge control points is set to zero. The matrix in Equation (3) is divided into three (4×n) 
sub-matrices [21, 22], each of which holds a single coordinate component. These matrices are Cx, Cy and Cz, 
which refer to the x, y and z components respectively. Equation (5) shows the matrix form of the x component; 
the y and z components can be represented in the same way. 


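Column labels: sec1, sec2, sec3, sec4 

$$C_x = \begin{bmatrix}
x(1,1) & x(1,2) & x(1,3) & \cdots & x(1,n) \\
x(2,1) & x(2,2) & x(2,3) & \cdots & x(2,n) \\
x(3,1) & x(3,2) & x(3,3) & \cdots & x(3,n) \\
x(4,1) & x(4,2) & x(4,3) & \cdots & x(4,n) \\
x(5,1) & x(5,2) & x(5,3) & \cdots & x(5,n)
\end{bmatrix} \qquad (5)$$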
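
The split into coordinate components can be sketched as follows. This is an illustration only: it follows the (4×n) shape stated in the text, assumes each slice lists its control points in the order right edge, right half, left edge, left half (the column order of Equation (3)), and uses the hypothetical name `split_components`.

```python
def split_components(ts, depths):
    """Split the control points into Cx, Cy and Cz (4 rows, n columns each).

    ts: n slices, each [right_edge, right_half, left_edge, left_half],
        with every point given as (x, y).
    depths: estimated depth of each slice (the network output).
    """
    cx = [[slice_[k][0] for slice_ in ts] for k in range(4)]
    cy = [[slice_[k][1] for slice_ in ts] for k in range(4)]
    # Edge control points (rows 0 and 2) get z = 0; the half control
    # points (rows 1 and 3) get the estimated depth of their slice.
    cz = [[0.0 if k in (0, 2) else d for d in depths] for k in range(4)]
    return cx, cy, cz
```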


The 3D shape is constructed using a bicubic Bezier surface, where the bicubic Bezier matrix 
is given by a grid of (4×4) control points. Each four slices represent 16 control points, or one Bezier 
patch, so for an object shape that has more than four slices a single surface patch is not enough to cover all 
of the object details. Hence the object surface is described by several patches joined together through 
continuity patches to ensure the smoothness of the surface. The main matrices Cx, Cy and Cz are divided into 
sub-matrices, each of which holds four slices (16 control points). The number of sub-matrices depends on the 
number of slices, as shown in [23, 24]. The bicubic Bezier surface can be written in matrix form through 
Equation (6). 


$$S(u, v) = [U][N][CP][N]^{T}[V]^{T} \qquad (6)$$


Here the 4×4 matrix [CP] stores the control points, [N] contains the Bernstein polynomial 
coefficients, and $[U] = [u^3 \; u^2 \; u \; 1]$ and $[V] = [v^3 \; v^2 \; v \; 1]$, as mentioned in [25]. The 
construction of the 3D shape through a set of Bezier patches and continuity patches is illustrated in Figure 6, 
where the cylinder shape with 33 slices is portrayed as a wireframe object that consists of six Bezier patches 
and five continuity patches. 
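
The matrix form of a bicubic Bezier patch can be evaluated as sketched below for one coordinate component (x, y or z). The coefficient matrix is the standard cubic Bernstein matrix for the power basis [t^3 t^2 t 1]; the function names are hypothetical and the sketch covers a single patch, not the continuity patches.

```python
# Bernstein coefficient matrix [N] for the power basis [t^3 t^2 t 1].
N = [[-1.0, 3.0, -3.0, 1.0],
     [3.0, -6.0, 3.0, 0.0],
     [-3.0, 3.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0]]

def bernstein_weights(t):
    """Row vector [t^3 t^2 t 1] times [N]: the four cubic Bernstein weights."""
    powers = [t ** 3, t ** 2, t, 1.0]
    return [sum(powers[i] * N[i][j] for i in range(4)) for j in range(4)]

def bezier_patch_point(cp, u, v):
    """Evaluate S(u, v) for one coordinate component; cp is a 4x4 grid."""
    bu, bv = bernstein_weights(u), bernstein_weights(v)
    return sum(bu[i] * cp[i][j] * bv[j] for i in range(4) for j in range(4))
```

Calling `bezier_patch_point` three times with the 4×4 blocks of Cx, Cy and Cz gives the 3D point of the patch at parameters (u, v), with u and v in [0, 1]. The patch passes through its four corner control points, which matches the anchoring property described above.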


Figure 6. Bezier surface of the cylinder 


4. RESULTS AND ANALYSIS 

In order to evaluate the performance of our methodology, we selected four 2D images of different 
shapes: a cone, a bowl, a vase and a sphere. These images contain a variety of curves and depths, which 
makes the construction of the 3D image more complex and challenging. Figure 7 shows the tested objects, 
where the pictures in the left column are the original 2D object images and those in the right column are the 
constructed 3D object shapes. The cone and bowl images are shown in Figure 7(a-L) and (b-L) 
respectively; each of them has a different depth profile and a total length of 40 cm. In the first object, the depth 
starts from (25) cm at the upper end and grows to (50) cm at the lower end, while in the second object the 
depth starts from (30) cm, enlarges to reach (50) cm and then decreases to (15) cm at the end of 
the object. The cone and bowl are formed from (64) slices, which are translated into a wire-frame Bezier 
surface with (11) Bezier patches connected by (10) continuity patches, as shown in Figure 7(a-R) and (b-R) 
respectively. 

Figure 7(c-L) shows the image of the vase object. Its depth values vary between (20) and (45) cm, 
and it is constructed as a shape with (56) slices. These slices are translated into (10) Bezier patches connected 
by (10) continuity patches. Together the patches produce a wire-frame Bezier surface of the vase object with 
various depths, as illustrated in Figure 7(c-R). 

Finally, the 2D image of the sphere object with a radius of (15) cm is shown in Figure 7(d-L). It is 
transformed to a shape of (22) slices with (4) Bezier patches connected by (3) continuity patches, as 
represented in Figure 7(d-R). The proposed methodology is evaluated through the experimental results, which 
indicate that the error value varies depending on the complexity of the tested shape. The mean error value E is 
calculated through Equation (7). 


$$E = \frac{1}{N} \sum_{i=1}^{N} e_i \qquad (7)$$


where N is the number of samples and e is the error value of the estimated depth, calculated for each 
sample as illustrated in Equation (8). 


$$e = \sum_{j=0}^{M} \frac{Rd_j - Ed_j}{Rd_j} \qquad (8)$$


where M is the number of slices in the object shape, and Rd and Ed are the real depth and the estimated depth 
of the object respectively. For the five tested samples (cylinder, cone, bowl, vase and sphere) the 
mean error value does not exceed (3%). 
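
The error computation can be sketched as below. Two choices here are assumptions of this sketch and are not stated in the text: the relative error of each slice is taken in absolute value, and the per-sample error is averaged over the slices so that it reads as a percentage-style ratio. The function names are hypothetical.

```python
def sample_error(real_depths, estimated_depths):
    """Relative depth error of one object, averaged over its slices.

    Assumption: absolute values and division by the slice count are
    added here so that errors of opposite sign do not cancel.
    """
    terms = [abs(rd - ed) / rd
             for rd, ed in zip(real_depths, estimated_depths)]
    return sum(terms) / len(terms)

def mean_error(sample_errors):
    """Mean error E over the N tested samples."""
    return sum(sample_errors) / len(sample_errors)
```

For example, a two-slice object with real depths 10 and 20 cm and estimates 9 and 21 cm has relative errors of 0.10 and 0.05, giving a sample error of 0.075.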


Figure 7. The 2D images of the cone, bowl, vase and sphere (left column, a-L to d-L) and their 3D Bezier surface images (right column, a-R to d-R) 


5. CONCLUSION 

In this paper the depth of an object in a 2D image is estimated and utilized in 3D shape 
construction. A still camera with a target point light is used to capture the 2D image, and image processing 
techniques are applied to get the contour of the object. The object contour is divided into two parts, the left 
and right sides. Two types of control points, named edge control points and half control points, are located on 
both sides. The control points help in determining the slice shape of the object. The neural network technique 
is used to estimate the object depth based on a set of input values: the length of the slice, the midpoint 
illumination, the illumination of the two half control points and the incident light angle. The Bezier surface is 
used to construct the 3D shape of the 2D image object based on the estimated depth values. In order to 
evaluate our proposed model, we selected four different objects (cone, bowl, vase and sphere), which 
are characterized by varying depth along the object shape. As a consequence, 3D shape estimation 
for these objects is a challenge for our proposed model. The results show a useful estimation, in which 
the mean error over all selected objects is within 3%. Our model has the potential to be used in robot grasping 
tasks, in which the geometry of a 3D object is reconstructed and the grasping position and orientation can then 
be determined. Another potential application is implementing the model in an industrial CNC machine, which 
could provide the machine with image-based depth estimation to manufacture a 3D object from a 2D image. 